 closure condition


Convergence of Policy Mirror Descent Beyond Compatible Function Approximation

arXiv.org Machine Learning

Modern policy optimization methods roughly follow the policy mirror descent (PMD) algorithmic template, for which there are by now numerous theoretical convergence results. However, most of these either target tabular environments, or can be applied effectively only when the class of policies being optimized over satisfies strong closure conditions, which is typically not the case when working with parametric policy classes in large-scale environments. In this work, we develop a theoretical framework for PMD for general policy classes where we replace the closure conditions with a strictly weaker variational gradient dominance assumption, and obtain upper bounds on the rate of convergence to the best-in-class policy. Our main result leverages a novel notion of smoothness with respect to a local norm induced by the occupancy measure of the current policy, and casts PMD as a particular instance of smooth non-convex optimization in non-Euclidean space.


Forward Kinematics of Object Transporting by a Multi-Robot System with a Deformable Sheet

arXiv.org Artificial Intelligence

We present object handling and transporting by a multi-robot team with a deformable sheet as a carrier. Due to the deformability of the sheet and the high dimension of the whole system, it is challenging to clearly describe all the possible positions of the object on the sheet for a given formation of the multi-robot system. A complete forward kinematics (FK) method is proposed for object handling by an $N$-mobile robot team with a deformable sheet. Based on the virtual variable cables model (VVCM), a constrained quadratic problem (CQP) is formulated by combining the geometric constraints and minimum potential energy conditions of the system. Analytical solutions to the CQP are presented and then further verified with the force closure condition. We present an FK algorithm based on the FK method to obtain all possible solutions with the given initial sheet shape and the robot team formation. We demonstrate the effectiveness, completeness, and efficiency of the FK algorithm with experimental results and case study examples.


Refining Labelled Systems for Modal and Constructive Logics with Applications

arXiv.org Artificial Intelligence

This thesis introduces the "method of structural refinement", which serves as a means of transforming the relational semantics of a modal and/or constructive logic into an 'economical' proof system by connecting two proof-theoretic paradigms: labelled and nested sequent calculi. The formalism of labelled sequents has been successful in that cut-free calculi in possession of desirable proof-theoretic properties can be automatically generated for large classes of logics. Despite these qualities, labelled systems make use of a complicated syntax that explicitly incorporates the semantics of the associated logic, and such systems typically violate the subformula property to a high degree. By contrast, nested sequent calculi employ a simpler syntax and adhere to a strict reading of the subformula property, making such systems useful in the design of automated reasoning algorithms. However, the downside of the nested sequent paradigm is that a general theory concerning the automated construction of such calculi (as in the labelled setting) is essentially absent, meaning that the construction of nested systems and the confirmation of their properties is usually done on a case-by-case basis. The refinement method connects both paradigms in a fruitful way, by transforming labelled systems into nested (or, refined labelled) systems with the properties of the former preserved throughout the transformation process. To demonstrate the method of refinement and some of its applications, we consider grammar logics, first-order intuitionistic logics, and deontic STIT logics. The introduced refined labelled calculi will be used to provide the first proof-search algorithms for deontic STIT logics. Furthermore, we employ our refined labelled calculi for grammar logics to show that every logic in the class possesses the effective Lyndon interpolation property.


Rational Interaction in Dialogues: Ingredients for Success

AAAI Conferences

In this paper, we discuss the question of closure conditions for dialogues in three different frameworks: W. C. Mann's DMT framework, Vanderveken's illocutionary theory of discourse, and Asher and Lascarides' SDRT approach. We are interested in formal frameworks that aim to describe the logical structure of conversations between diversely bounded agents who are (to some extent) rational, intelligent, and linguistically competent, and who possess some awareness of their environment and some knowledge of the circumstances of their interactions. We use the notion of closure conditions as a benchmark for theory comparison.


Integrating a Closed World Planner with an Open World Robot: A Case Study

AAAI Conferences

Consider the following problem: a human-robot team is actively engaged in an urban search and rescue (USAR) scenario inside a building of interest. The robot is placed inside this building, at the beginning of a long corridor; a sample layout is presented in Figure 1. The human team member has intimate knowledge of the building's layout, but is removed from the scene and can only interact with the robot via on-board wireless audio communication. The corridor in which the robot is located has doors leading off from either side into rooms, a fact known to the robot. However, unknown to the robot (and the human team member) is the possibility that these rooms may contain injured humans (victims).

In this paper, we explore the issues involved in engineering an automated planner to guide a robot towards maximizing net benefit accompanied with goal achievement in such scenarios. The planning problem that we face involves partial satisfaction (in that the robot has to weigh the rewards of the soft goals against the cost of achieving them); it also requires replanning ability (in that the robot has to modify its current plan based on new goals that are added). An additional (perhaps more severe) complication is that the planner needs to handle goals involving objects whose existence is not known in the initial state (e.g., the location of the humans …)